Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for basic cursors and limits to LookupSubjects #1379

Closed

Conversation

josephschorr
Copy link
Member

@josephschorr josephschorr commented Jun 1, 2023

This change supports a limit (called the "concrete limit") on LookupSubjects and will filter concrete subjects based on the returned cursor.

This change does not filter intermediate lookups, which will be done in a followup PR.

@josephschorr josephschorr requested review from vroldanbet and a team June 1, 2023 21:33
@github-actions github-actions bot added area/api v1 Affects the v1 API area/dependencies Affects dependencies area/dispatch Affects dispatching of requests area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) labels Jun 1, 2023
@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch 2 times, most recently from da48a6b to 19099a2 Compare June 1, 2023 21:50
@josephschorr josephschorr marked this pull request as draft June 20, 2023 23:38
@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch 3 times, most recently from 21ae960 to c386a05 Compare July 12, 2023 19:36
@josephschorr josephschorr marked this pull request as ready for review July 12, 2023 19:58
Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have time to finish the review, sorry! This is my first attempt. So far it's looking good, although I have a bunch of questions.

Something I noticed is that when one resumes with a cursor, we are evaluating the whole graph again and sending queries to the database, even though they will return empty. It works, but is a lot of work that is wasted after resuming, and will be directly proportional to the complexity of the schema. Ideally the evaluation of the schema can also resume from where the cursor left off, e.g. if permission view = a + b + c you'd restart evaluation at c if that's where you left when the limit was hit. It would spare a bunch of dispatching (network calls across SpiceDBs!) and DB roundtrips.

internal/graph/lookupsubjects.go Show resolved Hide resolved
Comment on lines 513 to 592
excludedSubjectIDs := make([]string, 0, len(foundSubject.ExcludedSubjects))
for _, excludedSubject := range foundSubject.ExcludedSubjects {
excludedSubjectIDs = append(excludedSubjectIDs, excludedSubject.SubjectId)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we merge this into the loop below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did so, but I'm thinking maybe we just remove it entirely now? The field has been marked deprecated for a number of versions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me!

internal/services/v1/permissions.go Outdated Show resolved Hide resolved
if subject == nil {
continue
}
encodedCursor, err := cursor.EncodeFromDispatchCursor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to allocate a new encoded cursor each time here? Could we reuse the same proto instance and just modify the corresponding field? I did something as an optimization in the ReadRelationships streaming bits and helped a bunch with allocations and GC overhead. It would also spare us calling revision.String() repeatedly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be far less maintainable, but I'll see what I can do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but its a bit ugly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a second pass, I thought the new code could have a race. Perhaps it's not worth the trouble? I'll leave it up to you

internal/services/v1/permissions.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved
for _, foundSubjects := range subjects {
for _, foundSubject := range foundSubjects.FoundSubjects {
// NOTE: wildcard is always returned, because it is needed by all branches, at all times.
if foundSubject.SubjectId == tuple.PublicWildcard || (afterSubjectID == "" || foundSubject.SubjectId > afterSubjectID) {
Copy link
Contributor

@vroldanbet vroldanbet Jul 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in not sure to understand how we guarantee that we do not skip found subjects that are alphabetically after the current cursor but that haven't been seen by the stream before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the stream always returns in sorted order from the root. That way, we are always guaranteed to have a defined ordering (alphabetical) coming out of each subproblem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that mean that we have to start collecting subjects from all subproblems before we can start streaming? We can make sure that each subproblem retrieves results in order from DB, but you still need to execute all of them in order to determine what's the first subject to stream?

internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved
internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved
pkg/genutil/slicez/chunking_test.go Outdated Show resolved Hide resolved
pkg/genutil/slicez/chunking_test.go Outdated Show resolved Hide resolved
internal/services/v1/permissions_test.go Outdated Show resolved Hide resolved

dispatchCount, err := responsemeta.GetIntResponseTrailerMetadata(trailer, responsemeta.DispatchedOperationsCount)
req.NoError(err)
req.GreaterOrEqual(dispatchCount, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there had been some regressions recently around dispatches and I wonder if we could use a new test-case that checks the number of dispatches over a cursored LR call

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean over the LS call? We do now have the test for LR calls

Copy link
Contributor

@vroldanbet vroldanbet May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean testing for LS. The logic should be different enough that it warrants another set of tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Added

internal/datasets/subjectsetbytype.go Show resolved Hide resolved
internal/datasets/basesubjectset.go Outdated Show resolved Hide resolved
internal/dispatch/keys/computed.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved
internal/graph/lookupsubjects.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Show resolved Hide resolved
}

// filterSubjectsMap filters the subjects found in the subjects map to only those allowed, returning an updated map.
func filterSubjectsMap(subjects map[string]*v1.FoundSubjects, allowedSubjectIds ...string) map[string]*v1.FoundSubjects {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no use of the variadic allowedSubjectIds so please pass as a slice. NewSet should also support a slice - a lot of calls in the codebase that end up doing ... just to pass it to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed here but not NewSet. We have some places where the variadic is helpful. If you like, I can add another NewSetFromSlice and change the call sites?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep I think that'd be a good idea

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done and changed all the call sites

internal/graph/lookupsubjects.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved
internal/graph/lookupsubjects_test.go Show resolved Hide resolved
@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch from c386a05 to 137e08e Compare September 6, 2023 17:57
@github-actions github-actions bot removed the area/dependencies Affects dependencies label Sep 6, 2023
@josephschorr
Copy link
Member Author

Updated

@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch from 137e08e to 1dec055 Compare March 11, 2024 17:40
@josephschorr
Copy link
Member Author

Rebased

@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch 2 times, most recently from 3202a4b to b00913b Compare March 11, 2024 20:36
@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch from b00913b to 83a96b2 Compare March 25, 2024 13:11
@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch from 83a96b2 to 39f033f Compare April 26, 2024 19:13
@josephschorr
Copy link
Member Author

Rebased

@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch 2 times, most recently from 57a79b9 to 07dbacd Compare May 14, 2024 15:48
Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing a complete re-review to load all the context. First batch of feedback.

internal/services/v1/permissions.go Show resolved Hide resolved
internal/services/v1/permissions.go Outdated Show resolved Hide resolved
excludedSubjectIDs := make([]string, 0, len(foundSubject.ExcludedSubjects))
for _, excludedSubject := range foundSubject.ExcludedSubjects {
excludedSubjectIDs = append(excludedSubjectIDs, excludedSubject.SubjectId)
for {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The handling stream should handle all the elements up to the concrete limit, wouldn't it? From looking at the code, it would seem the only sole purpose is making sure the wildcard is sent, since in the worst-case scenario we stream up to the concrete limit, and we have to handle the potential +1 of the wildcard. Is that correct?

internal/services/v1/permissions.go Show resolved Hide resolved
internal/services/v1/permissions.go Show resolved Hide resolved
internal/services/v1/permissions.go Show resolved Hide resolved
internal/services/v1/permissions.go Show resolved Hide resolved
internal/services/v1/permissions.go Show resolved Hide resolved
internal/services/v1/permissions.go Outdated Show resolved Hide resolved
pkg/proto/dispatch/v1/dispatch_vtproto.pb.go Outdated Show resolved Hide resolved
Copy link
Contributor

@vroldanbet vroldanbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josephschorr last batch of feedback. Let's address this quickly so we don't loose the context and we can land it soon.

internal/graph/lookupsubjects.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved
internal/graph/lookupsubjects.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved
internal/graph/lookupsubjects.go Outdated Show resolved Hide resolved

dispatchCount, err := responsemeta.GetIntResponseTrailerMetadata(trailer, responsemeta.DispatchedOperationsCount)
req.NoError(err)
req.GreaterOrEqual(dispatchCount, 0)
Copy link
Contributor

@vroldanbet vroldanbet May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean testing for LS. The logic should be different enough that it warrants another set of tests?

if subject == nil {
continue
}
encodedCursor, err := cursor.EncodeFromDispatchCursor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a second pass, I thought the new code could have a race. Perhaps it's not worth the trouble? I'll leave it up to you

Comment on lines 513 to 592
excludedSubjectIDs := make([]string, 0, len(foundSubject.ExcludedSubjects))
for _, excludedSubject := range foundSubject.ExcludedSubjects {
excludedSubjectIDs = append(excludedSubjectIDs, excludedSubject.SubjectId)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me!

internal/graph/lookupsubjects_test.go Show resolved Hide resolved
internal/graph/lookupsubjects.go Show resolved Hide resolved
@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch from 07dbacd to 0c9fbb7 Compare May 17, 2024 21:05
@josephschorr
Copy link
Member Author

Updated

@josephschorr josephschorr force-pushed the cursored-lookup-subjects branch from 0c9fbb7 to 1f765d0 Compare May 20, 2024 18:33
@github-actions github-actions bot added the area/datastore Affects the storage system label May 20, 2024
This change supports a limit (called the "concrete limit") on LookupSubjects and will filter concrete subjects based on the returned cursor.

This change does *not* filter intermediate lookups, which will be done in a followup PR.
@josephschorr
Copy link
Member Author

Moved to draft due to steelthread discovering a pagination bug

@karlwang1983
Copy link

@vroldanbet Is there any progress for this pr, we are still looking forward the pagination function of LookupSubjects.

@vroldanbet
Copy link
Contributor

@karlwang1983 I will defer to @josephschorr, my role has been to review the code.

@mateenkasim
Copy link

Also looking forward to this one!

@josephschorr
Copy link
Member Author

Closing, as we will be taking a different approach to LookupSubjects pagination given that the current approach does not work as we intended

@github-actions github-actions bot locked and limited conversation to collaborators Jan 6, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/api v1 Affects the v1 API area/datastore Affects the storage system area/dispatch Affects dispatching of requests area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants